4 research outputs found

An Information Retrieval Test Collection for English SMS Conversations

Author: Sankepally Rashmi
Publication date: 01/01/2015

Information retrieval research for informal conversational settings differs in important ways from the more traditional goal of document retrieval. The goal of this research is to build an information retrieval test collection from informal conversational messages and to demonstrate the use of that collection to compare the retrieval effectiveness of some information retrieval systems. The test collection is based on the Linguistic Data Consortium's collection of more than 8,000 English SMS (Short Message Service) conversations, which contain more than 120,000 individual messages. The collection is described, followed by a description of the processes for creating and collecting topics, performing relevance judgments, and establishing baseline results. The findings indicate that traditional approaches for building information retrieval test collections can reasonably be applied to preclustered SMS conversations, but that the process of creating relevance judgments is somewhat more challenging, and thus the reliable detection of differences in system effectiveness is somewhat more complex.

Digital Repository at the University of Maryland

Using Zero-Resource Spoken Term Discovery for Ranked Retrieval

Author: Aren Jansen
Douglas W. Oard
Jerome White
Jiaul Paik
Rashmi Sankepally
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2015

Research on ranked retrieval of spoken content has assumed the existence of some automated (word or phonetic) transcription. Recently, however, methods have been demonstrated for matching spoken terms to spoken content without the need for language-tuned transcription. This paper describes the first application of such techniques to ranked retrieval, evaluated using a newly created test collection. Both the queries and the collection to be searched are based on Gujarati produced naturally by native speakers; relevance assessment was performed by other native speakers of Gujarati. Ranked retrieval is based on fast acoustic matching that identifies a deeply nested set of matching speech regions, coupled with ways of combining evidence from those matching regions. Results indicate that the resulting ranked lists may be useful for some practical similarity-based ranking tasks.

CiteSeerX

Crossref

CADET: Computer Assisted Discovery Extraction and Translation

Author: Burchfield Deana
Carrell Annabelle
Chaloux Julianne
Chen Tongfei
Comerford Alex
Costello Cash
Dredze Mark
Duh Kevin
Finin Tim
Glass Benjamin
Hao Shudong
Harman Craig
Koehn Philipp
Lawrie Dawn
Lippincott Tom
Martin Patrick
May Chandler
Mayfield James
Miller Scott
Poliak Adam
Rastogi Pushpendre
Sankepally Rashmi
Thomas Max
Tran Ying-Ying
Van Durme Benjamin
Wolfe Travis
Zhang Ted
Publication date: 01/12/2017

Edinburgh Research Explorer

The FIRE 2013 question answering for the spoken web task. Forum for Information Retrieval Evaluation

Author: Aren Jansen
Douglas W. Oard
Jerome White
Jiaul Paik
Rashmi Sankepally
Publication venue: Forum for Information Retrieval Evaluation
Publication date: 01/01/2013

The FIRE 2013 Question Answering for the Spoken Web (QASW) task was an information retrieval evaluation in which the goal was to match spoken Gujarati questions to spoken Gujarati answers. This paper describes the design of the task, the development of the test collection, the runs that were submitted, and the corresponding results.

CiteSeerX